Telephone based speaker recognition using multiple binary classifier and Gaussian mixture models

Authors

  • Pierre Castellano
  • Stefan Slomka
  • Sridha Sridharan
Abstract

The present study evaluates MBCM and GMM solutions for both ASV and ASI problems involving text-independent telephone speech from the King speech database. The MBCM's accuracy is enhanced by selectively removing those classifiers within the model which perform worst (pruning). An unpruned MBCM outperforms a GMM for ASV and speakers taken from within the same dialectic region (San Diego, CA). Once pruned, the MBCM is found to be 2.6 times more accurate than the GMM. For closed set ASI, based on the same data, the MBCM is roughly twice as accurate as the GMM but only after pruning.
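The pruning idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the MBCM's classifiers are built, scored, or combined, so everything below is an assumption made for illustration: the toy threshold classifiers, the held-out accuracy ranking in `prune`, and the majority vote in `vote` are hypothetical and are not the authors' method.

```python
from typing import Callable, List, Sequence

# Hypothetical binary classifier: returns 1 (target speaker) or 0 (other).
Classifier = Callable[[Sequence[float]], int]


def accuracy(clf: Classifier, features, labels) -> float:
    """Fraction of held-out examples that a single classifier labels correctly."""
    correct = sum(1 for x, y in zip(features, labels) if clf(x) == y)
    return correct / len(labels)


def prune(classifiers: List[Classifier], features, labels, keep: int) -> List[Classifier]:
    """Keep the `keep` best classifiers, ranked by held-out accuracy (assumed criterion)."""
    ranked = sorted(classifiers, key=lambda c: accuracy(c, features, labels), reverse=True)
    return ranked[:keep]


def vote(classifiers: List[Classifier], x) -> int:
    """Combine binary decisions by strict majority vote (assumed combination rule)."""
    positives = sum(c(x) for c in classifiers)
    return 1 if 2 * positives > len(classifiers) else 0


# Toy one-dimensional data, invented purely for this example.
features = [[0.1], [0.4], [0.6], [0.9]]
labels = [0, 0, 1, 1]


def good(x):  # correct on all four toy examples
    return int(x[0] > 0.5)


def fair(x):  # wrong only on the 0.4 example
    return int(x[0] > 0.3)


def bad(x):  # wrong on every toy example
    return int(x[0] < 0.5)


ensemble = [bad, fair, good]
pruned = prune(ensemble, features, labels, keep=2)

unpruned_decision = vote(ensemble, [0.4])
pruned_decision = vote(pruned, [0.4])
```

In this toy setup the worst classifier tips the unpruned vote the wrong way for the input `[0.4]`, while the pruned ensemble decides correctly. This only illustrates why removing weak members can help; it says nothing about the magnitude of the gains reported in the abstract.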

Similar resources

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

Probabilistic Neural Networks Combined with Gmms for Speaker Recognition over Telephone Channels

In this paper we study the applicability of Probabilistic Neural Networks (PNNs) as core classifiers to medium scale speaker recognition over fixed telephone networks. In particular, banking applications with up to 400 enrolled speakers and short training times are targeted. Two PNN-based open-set text-independent systems for Speaker Identification and Speaker Verification correspondingly are p...

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

Robust text-independent speaker identification using Gaussian mixture speaker models

This paper introduces and motivates the use of Gaussian mixture models (GMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterance ...

Speaker Identification Using Gaussian Mixture Models

In this paper, the performance of Perceptual Linear Prediction (PLP) features has been compared with the performance of Linear Prediction Coefficient (LPC) features for speaker identification. Two classification techniques, Gaussian Mixture Models (GMM) and Vector Quantization (VQ) with Dynamic Time Warping (DTW), are used for classification of speakers based on their speech samples into respec...

Publication date: 1997